- Bugs (real and/or imagined):
- ---------------------------
-
- - funzip/more/decryption/no-echo bug: race condition(?) causes terminal
- to be "reset" to no-echo state
- - large file incorrectly compressed by PKZIP or decompressed by unzip [bottom
- of BUGS.long]
- - directory dates/times (special Unix perms?) not restored
- - NT: SetFileAttributes never called
- - NT: only FAT (and HPFS?) filesystem fully supported; VMS-like errors with
- NTFS (Scott Briggs, bottom of BUGS.long)
- - VMS docs out of date
- - Macintosh (100200), Atari (020000) external file attributes not interpreted
- correctly (both unzip and zipinfo)
- - pkbug error: zipfile with incorrect csize and/or ucsize--check for end of
- compressed (csize) data in uncompression routines:
- unimplod.c: while ((!zipeof) && ((outpos + outcnt) < ucsize)) {
- unreduce.c: while (((outpos + outcnt) < ucsize) && (!zipeof)) {
- [James Birdsall, Mark, bottom of BUGS.long]
- - if PK signature not found, append .zip and try again without error
- messages [Jean-loup, others, bottom of BUGS.long]
- - disk full: a few files clear some pointer; continuing beyond "Continue?"
- prompt, regardless of answer, kills unzip--stack too small? (doesn't seem
- to matter) Bug in MSC write() function? Subsequent write code isn't any
- different from -t option, so unlikely to be bug in uncompress routines...
- File descriptor bad/close() failure? (workaround: ^C at prompt)
- - textfile conversions on a PC system add extra CR to lines which already have
- CR/LF combo; other directions probably don't work, either (Mac/Unix/...):
- rewrite "dos2unix" and make general
- - fix "no errors detected" message for errors occurring *before*
- extract_or_test_files(); count errors? differentiate between errors and
- warnings?
- - MSDOS/Windows: compatible with Win 3.1 only, not 3.0
-
- - OS/2: directory EAs not restored if directory exists [Kai Uwe, KG27515@uark]
- (subsequent note: no way to determine which EAs are newer ==> cannot
- restore without user input)
-
- - VMS unzip no longer sets permissions correctly [should be fixed in 5.1d]
-
-
- Features (possible and/or definite):
- -----------------------------------
-
- - ignore case for internal filename match on non-Unix systems, unless file-
- specs enclosed in single quotes
- - build in capability to check text/binary type and warn if -a (if version
- < 1.1 and not made on DOS--i.e., not early Info-ZIP versions)
- - allow wildcards in zipfile name (loop through each one)
- - add capability to extract to specified subdirectory [Kevin Fischer, 92.8.24]
- - add "near" to global vars [Steve Salisbury, 92.4.21]
- - construct CRC table dynamically? [Jean-loup, 92.5.12]
- - when listing filenames, use '?' for non-printables? [Thomas Wolff, 92.6.1]
- - modify to decompress input stream if part of a pipe, but continue using
- central directory if not (BIG job!)--extended local header capability
- - test/incorporate Martin Schulz optimization patch (still useful?)
- - add -oo option (overwrite and override): no user queries (if bad password,
- skip file; if disk full, take default action; if VMS special on non-VMS,
- unpack anyway; etc.)
- - add -Q[Q[Q]] option (quiet mode on comments, cautions, warnings and errors):
- forget -oo, or make synonym? Default level -Q?
- - rewrite mapname()
- - modify set_file_time routines to share common code (macro?)
- - add zipinfo "in-depth" option? (check local vs. central filenames, etc.)
- - create zipcat program to concatenate zipfiles
- - create zipfix program to rebuild/salvage damaged zipfiles
- - assembly-language routines?
- - CP/M version (Jeffery Foy)
- - VM/CMS version (Chua Kong Sian, others)
- - put man pages in more "proper" nroff format
- - add OS/2 .INF format helpfiles for UnZip and ZipInfo
-
-
-
-
- Additional thoughts/reports/arguments/issues (many VMS):
- -------------------------------------------------------
-
- From: tp@mccall.com (Terry Poot)
- Newsgroups: comp.os.vms
- Subject: Re: Speeding up VAX C writes
- Date: 22 Oct 91 11:48:59 GMT
-
- In article <1991Oct21.130745.1018@lsl.co.uk>, paul@lsl.co.uk (Paul Hardy)
- writes:
- >Some months ago, I remember seeing an item here about how to speed up
- >writes from VAX C, in the case where the data being written was binary.
-
- Actually, the only trick I know about works no matter what the data is. I'm
- talking here about the normal C stream_lf files. You can use other RMS file
- types by specifying parameters to fopen, open, or creat, and they may be faster
- or slower depending on what you are doing, and how you are doing it. However,
- there is a way to speed up most I/O on stream files, in many cases
- dramatically.
-
- You want to add the argument "mbc=nnn" where nnn is the number of pages of
- memory you can spare for I/O buffering. mbc is the RMS field name for the
- Multi-Block Count. It tells the C rtl to use a buffer big enough to hold nnn
- blocks (blocks are the same size as pages for a little while longer at least,
- 512 bytes). Thus rather than reading or writing a block at a time, you can have
- it do it 200 blocks at a time, if you can spare 100Kb of memory, or any other
- value. (I'm sure there's an upper limit, but I don't know what it is. In
- practical terms, for most people it'll probably be PGFLQUO.)
-
- BTW, note that I said the C rtl and not RMS. C I/O uses RMS block mode I/O,
- which doesn't actually support mbc. It does, however, support whatever buffer
- size you choose, as long as it is an integral number of blocks. The designers
- of the RTL decided to check for mbc even on normal C stream files and allocate
- the buffer size accordingly (Thanks, guys!). This is why specifying mbf, the
- multi-buffer count, doesn't do anything. They didn't do multi-buffering, and RMS
- doesn't support that either for block mode I/O. (Anyone wanna submit an
- enhancement request SPR?)
-
- These little tidbits of info are courtesy one of the C RTL developers, from
- discussion after a session on C I/O at the DECUS symposium in Las Vegas.
-
-
- ------------------------------
- From: techentin@Mayo.EDU (Bob Techentin)
- Newsgroups: comp.os.vms
- Subject: Re: Speeding up VAX C writes
-
- Paul Hardy, <paul@lsl.co.uk> asked:
-
- > Some months ago, I remember seeing an item here about how to speed up
- > writes from VAX C, in the case where the data being written was binary.
- >
- > We now have an application which is using fopen, then fseek and fwrite,
- > and is taking forever to do the writes. I'm sure that by putting the right
- > magic words in the fopen, we can cut out a lot of the unnecessary work
- > that the C RTL is doing on our behalf.
-
- The VAX C RTL function fopen() creates files that are stream-lf by default.
- This is not a natural kind of file for RMS, which prefers record oriented
- structures. You can use some non-portable options in the fopen() command to
- create record oriented files. The call:
-
- fp = fopen(filename, access, "ctx=rec");
-
- will force the RTL to use record access for read or write - even on a stream-lf
- file. This improves performance significantly. The call:
-
- fp = fopen(filename, "w", "rfm=var", "rat=cr");
-
- will create a variable length record/carriage return carriage control file
- (instead of a stream-lf/no carriage control file), which is what your typical
- text editor creates. Read and write performance to record structure files is
- even better than just specifying record access.
-
- You can use other options documented in the VAX C Run-Time Library Reference
- Manual, creat() function. We set multibuffer counts, default extensions, and
- read-ahead and write-behind buffering to improve our file performance.
-
- Bob Techentin Internet: Techentin@Mayo.Edu
- Mayo Foundation, Rochester MN, 55905 USA (507) 284-2702
-
-
- ------------------------------
- From: Jean-loup Gailly <jloup@chorus.fr>
- Subject: Re: unzip 4.12/4.2
- Date: Thu, 07 Nov 91 18:17:49 +0100
-
- > - Speaking of EOLs, I completely ignored that whole contro-
- > versy, and I will continue to do so until someone can come
- > up with something which works and on which most of us can
- > agree.
-
- You have to wait until zip sets the ascii/binary flag (almost) correctly.
- At this point, you will be able to warn if -a is used on binary
- files and ask if -a should be ignored for such files (y/n/all).
- This proposal is simple and does not seem too controversial.
-
-
- ------------------------------
- From: Jean-loup Gailly <jloup@chorus.fr>
- Subject: unzip problem
- Date: Thu, 23 Jan 92 09:17:03 +0100
-
- The problem was that both files test and test.zip existed in the
- same directory. The command "unzip test" caused unzip to look at
- the file test instead of test.zip. I think I already suggested
- a long time ago that if unzip fails to find the PK signature
- in a zip file, it should check if a file with .zip appended also
- exists.
-
-
- ------------------------------
- Date: Mon, 28 Jan 91 17:39:39 SST
- From: Chua Kong Sian <CCECKS%NUSVM.BITNET@CUNYVM.CUNY.EDU>
-
- Hi. I'm attempting to port UNZIP403 to the S/370 mainframe. In the
- PS: Is anybody else porting it to VM/CMS?
-
- Chua Kong Sian
- National University of Singapore
-
-
- ------------------------------
- Date: Wed, 27 Feb 91 14:52:41 EST
- From: Michael Regoli <mr@ogre.cica.indiana.edu>
-
- Also, it may just be my software version, but there is no stdlib.h in
- the IBM distribution of BSD 4.3. I've always had to comment-out the
- #include statement from unzip.h to get it to compile. Since it was
- put under the "random extra stuff" perhaps most systems don't need it?
-
-
- ------------------------------
- Date: Sun, 28 Apr 1991 12:19:32 PDT
- From: "Jeffery Foy" <foysys!jeffery@cs.washington.edu>
-
- Am wondering if anyone else is porting ZIP/Unzip to CP/M. I've been
- hacking at Unzip and actually have a working copy. Ah, but ZIP is
-
-
- ------------------------------
- Date: Sun, 12 May 91 11:30:40 PDT
- From: madler@cobalt.cco.caltech.edu (Mark Adler)
-
- James Birdsall cautions:
-
- >> Warning! Do NOT trust PKUNZIP -t. I have a ZIP file which is badly
- >> corrupted -- missing about 25K from the end. PKUNZIP -t claims it is OK.
-
- Rather alarming. It is a problem, but it's not as bad as it sounds
- initially.
-
- I took a look at his zoo201.zip (which has the single entry zoo201.tar)
- and here's what I found:
-
- 1. Unzip fails with a crc error.
- 2. PKUNZIP succeeds, but the resulting file is too short (444416
- compared to the 532480 reported in the zip file).
- 3. If I PKZIP that 444416 byte file, I get exactly zoo201.zip back.
- (Except the length entries are 444416 in the local and central
- headers--a total difference of four bytes between the two.)
- 4. Unzip thinks that new zip file is ok.
- 5. 444416 is a multiple of 2048.
-
- Conclusions:
-
- 1. What's "wrong" in zoo201.zip is the length, not the data.
- 2. PKUNZIP doesn't notice this, since it is driven by the compressed
- "size" and not the uncompressed "length".
- 3. Unzip *does* notice because it is (apparently) driven by the
- length, and not the size. (I haven't looked into this yet.)
- 4. What must have happened is that there was an error reading the
- original 532480 byte file at the 444417th byte (probably the
- 869th 512-byte sector), but PKZIP thought it was just the end of
- file and stopped compressing.
-
- Bugs:
-
- 1. PKZIP doesn't handle read errors properly, and doesn't check the
- number of bytes read against the size of the file. (This is not
- a problem with Zip by the way, since it does look for read errors,
- and it sets the length from the number of bytes actually read from
- the file.)
- 2. PKUNZIP never looks at the length to see if it decompressed the
- right number of bytes--it only cares if the CRC is right.
- 3. Unzip doesn't stop when it runs out of compressed data (otherwise
- it would have gotten the right CRC, as it did with the re-PKZIPped
- version). Both PKUNZIP and Unzip should only decompress "size"
- bytes and check that "length" bytes were generated.
-
- Mark Adler
- madler@pooh.caltech.edu
-
-
- ------------------------------
- Date: Sun, 12 May 91 13:57:15 PDT
- From: jwbirdsa@amc.com (James Birdsall)
-
- directory at the end of the file. The compressed size is accurate -- for
- the amount of data that was included. The uncompressed size is accurate for
- the original file. The only difference between the good file I made from
-
- I hypothesize that PKZIP, when creating the bad file, got a premature
- EOF when reading the source (which was one big tar file on an NFS-mounted
- drive). Assuming it was done, it proceeded to button up the file as though
-
-
- ------------------------------
- Date: Mon, 19 Aug 91 14:16:04 +0200
- From: Jean-loup Gailly <jloup@chorus.fr>
-
- But I still think that zip should make a little more effort to
- set the file type (ascii/binary) properly. It is impossible to get
- 100% accuracy, but that would help unzip to warn about -a used on
- binary files. I will try to do this for zip 1.1 if some kind soul
- is willing to adapt unzip to take advantage of this file type.
-
- Jean-loup
-
-
- ------------------------------
- From stevesa@microsoft.com Tue Apr 21 20:28:05 1992
- To: jloup@chorus.fr
- Cc: zip-bugs@CS.UCLA.EDU
- Subject: Re: UNZIP suggestion (Buffer Size flexibility)
- Date: Tue, 21 Apr 92 17:26:43 PDT
-
- Also, Jean-loup added the "near" keyword to all of ZIP 1.6's global variables
- (both the extern declarations and the actual definitions) which causes better
- code generation for Microsoft C in compact model. Shouldn't we do the same
- for UNZIP 5.X?
-
-
- ------------------------------
- Date: Wed, 18 Mar 92 06:57:28 PST
- From: madler@cco.caltech.edu (Mark Adler)
- To: zip-bugs@CS.UCLA.EDU
- Subject: Re: yet another zip format extension
-
- As for other one-pass implementation details, I have given a
- little thought to that. Unzip should write an auxiliary file as
- it's unzipping containing enough information to go back and
- change attributes, file names, and possibly do some file
- conversions, upon encountering the central directory. I'm
- figuring that it can just write all the entries to temporary file
- names at the base of where it's supposed to unzip to (or a
- specified temporary area), and then move, chmod, or convert files
- to where and what they're supposed to be later. When you get to
- the central directory, you start reading the auxiliary file for
- the temporary names. We can also use the good old tempf routines
- to avoid writing small auxiliary files.
-
-
- From: pdh@netcom.com (Phil Howard )
- Newsgroups: comp.compression
- Subject: Re: Wanted: ZIP on VM/CMS
- Date: 20 May 92 22:31:24 GMT
- Organization: Netcom - Online Communication Services (408 241-9760 guest)
-
- lmfken@lmf.ericsson.se (Kenneth Osterberg) writes:
-
- >The subject says it all. Has anybody ported InfoZIP (both compression
- >and decompression) to the VM/CMS environment? Any estimates on the
- >work involved doing this?
-
- One of the problems is how to represent a byte-stream or byte-array
- file on VM/CMS. The conventional way is to store arbitrary sized
- pieces of the byte stream as individual variable length records in
- a type "V" file. The record boundaries are arbitrary and reconstruction
- of the original stream is done simply by ignoring those boundaries.
-
- This is the way Kermit and FTP usually transfer data in binary mode.
-
- However it takes 2 bytes of allocated and addressable CMS file space
- to store those boundaries. Since the lengths are arbitrary, it is not
- possible to randomly address a specific byte of the stream itself,
- since you really have no way of knowing how many record boundaries
- precede "stream byte N". You would need to know that to know what
- record number that byte is in.
-
- You could read the "V" file sequentially, but this is very inefficient
- and certainly to be avoided.
-
- The other method of representing a byte-stream or byte-array file is
- to use a type "F" file (fixed length records) with a record length
- of exactly 1. However there are disadvantages to this as well:
-
- 1. Even though Kermit and FTP can create and handle these files, it is
- not the default.
-
- 2. Many CMS tools read and write files 1 record at a time (instead of
- the more sophisticated and version sensitive method of reading
- and writing many records or whole blocks). This makes handling of
- such files very slow. Unzipping a 1 megabyte file might take 30
- seconds whereas copying it to another minidisk could take 30 minutes.
-
- 3. Transferring such files across networks tends to expand on the space
- requirements. While in transit between VM/CMS systems, such files
- may take up to 3 times the normal space, apparently depending on the
- version of RSCS.
-
- 4. Possibly no or poor support in systems like IBM TSO and DEC VMS.
- I have in the past sent such files to a VMS host, which proceeded
- to store them internally with a CR/LF between each byte (it thought
- they were records) including deleting the bytes with values of
- either 0x40 (EBCDIC space) or 0x0d (CR). This resulted in an
- ambiguity making the original not reconstructable even if one were
- to write a tool to attempt it.
-
- Other representation ideas include:
-
- 1. Use larger fixed length records, with the byte count filled in as
- the first 4 bytes of the first block, or the first 4 bytes of the
- last block, or the last 4 bytes of the last block (block being the
- chosen and agreed on fixed record size, not the DASD blocksize).
-
- 2. Agree on a non-arbitrary record length for the type "V" representation
- originally described.
-
- Both of the above ideas would require changes to things like Kermit and
- FTP in order to support representing byte-stream data in that way. This
- would require convincing IBM in one case.
-
- Further complications to porting ZIP to VM/CMS include many aspects of
- the different character code in use (EBCDIC) and what to do with the
- files being added or extracted. One already sees the minor problem of
- extracting a text file from a ZIP archive on UNIX to find that it has
- these nasty carriage return thingys that stick "^M" on the end of all
- the lines when using the "less" program. It gets worse, MUCH worse,
- when dealing with VM/CMS (and I am sure TSO would be worse than CMS).
-
- I originally offered to port it, but I let it drop because of the large
- amount of work it would require, because at the time someone else had
- offered to do so, and because I didn't (and still don't) have a C compiler
- I feel I can work with (I really was considering translating it all to 370
- assembler, which is a language I can program in as fast as, or faster
- than, C). Since there is no hint of a VM/CMS version of ZIP, I suspect
- the other person gave up as well. I wouldn't blame them for it.
-
- A version of ARC was ported to VM/CMS. It required the archive be stored
- in a "F 512" file and it produced "F 512" files. Apparently it was ported
- from the CP/M version. I've used it (a program called "VMARC" is something
- else).
-
- ------------------------------
-
- Date: Wed, 3 Jun 92 13:43:53 EDT
- From: davidsen@ariel.crd.ge.com (william E Davidsen)
- Subject: unz50i - DECLARE_ERRNO
-
- I would suggest that DECLARE_ERRNO be undefined by default, and that only those
- systems which fail to declare errno in their errno.h file turn it on. The
- current logic for MSC 6.0 and later doesn't seem to work for the sco_dos
- compile, and I found errno properly defined without it on MIPS, Ultrix-VAX,
- Sun, HP-UX, Convex, etc. A system with a broken errno.h should be the
- exception rather than the default.
-
- ------------------------------
-
- Date: Mon, 24 Aug 92 10:34:18 -0700
- From: kfischer@SEAS.UCLA.EDU (Kevin J. Fischer)
- Subject: unzip suggestion
-
- I would suggest that you add the ability to extract to a specific path
- to the unzip program. This would be similar to how PKunzip does it:
- PKUNZIP [options] zipfile [d:outpath\] [file...]
- Notice that it allows you to specify the outpath before the list of files
- to be extracted.
-
-
- Date: Wed, 26 Aug 92 14:11:03 MDT
- From: "Piet W. Plomp" <piet@asterix.icce.rug.nl>
- Subject: unzip: extract to specified directory
-
- It is an option I use very much myself too and miss very much in unzip.
- So I do *STRONGLY* support the idea, and am willing to help in rewriting
- code for that purpose, although I don't have the time to do any coordination.
-
- [GRR: just check for trailing '/' and '\\'? ']' in VMS?]
-
- ------------------------------
-
- Date: Thu, 3 Sep 92 11:30:55 EDT
- From: briggs@nashua.progress.com (Scott Briggs)
- Subject: Re: v31i133: zip19 - Info-ZIP portable Zip, version 1.9, Patch01
-
- I have encountered the following problems with your zip and unzip
- when used under NT on an NTFS file system.
-
- 1) In the file FILEIO.C [zip] you have assumed under NT the 8.3 filename size.
- The structure DIR, field d_name should be changed to:
- char d_name[ MAX_PATH ]; // MAX_PATH=260
- 2) When zip'ing on an NTFS filesystem, it seems that the reported length
- of the file and the actual length of the file are not the same.
- I believe this is in FILEIO.C again; you've placed an #ifdef VMS
- around it and indicated that (VMS record lengths vary?). This may
- also be the case with NTFS; further testing seems necessary.
- The zip'ing fails when this occurs.
- 3) Unzip'ing has a similar problem to #2 when the unzip'ing occurs
- on an NTFS filesystem.
- 4) Unfortunately I can't recall the exact file, but I believe that you
- also set dosify=1 by default when zip and unzip are compiled on NT.
-
- Please remember that NT supports NTFS, FAT, and HPFS filesystems.
-
- ------------------------------
-
- Date: Wed, 14 Oct 92 06:51:50 GMT
- From: James R. Van Artsdalen <james@bigtex.cactus.org>
- Subject: unzip 5.0 bug
-
- I have a file, which when compressed by either zip 1.9p1 (using -9) or
- pkzip 1.93a (using -ex), cannot be unzipped by unzip 5.0. unzip 5.0
- also fails the CRC check on the zipped files when the "-t" option is
- used.
-
- pkunzip 1.93a fails the CRC test on the file compressed by zip 1.9p1;
- I don't know if this is a problem or not.
-
- pkunzip 1.93a does pass the CRC test on the file compressed by pkzip 1.93a.
- pkunzip 1.93a does unzip the pkzip'd file correctly.
-
- unzip 5.0 does test the zip 1.9p1-produced file correctly when one
- of the options -0, -1, -2, -3, -4, -5, -7 or -8 is used with zip 1.9p1.
-
- The uncompressed file is ~ 38 MB. unix compress(1) gets ~ 18.8 MB,
- and pkzip 1.93a gets 13 MB with -ex.
-
- ------------------------------